On the Parallelization of Subproduct Tree Techniques Targeting Many-Core Architectures

Authors

  • Sardar Anisul Haque
  • Farnam Mansouri
  • Marc Moreno Maza
Abstract

We propose parallel algorithms for operations on univariate polynomials (multi-point evaluation, interpolation) based on subproduct tree techniques and targeting many-core GPUs. On those architectures, we demonstrate the importance of adaptive algorithms, in particular the combination of parallel plain arithmetic and parallel FFT-based arithmetic. Experimental results illustrate the benefits of our algorithms.
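For readers unfamiliar with the technique, the following is a minimal sequential sketch of multi-point evaluation with a subproduct tree. It is not the authors' GPU implementation: the prime modulus `P`, the function names, and the use of plain (quadratic) arithmetic throughout are assumptions made for illustration. Within each level of the tree, the products (going up) and the remainders (going down) are independent of one another, which is the structure that a parallel version can exploit.

```python
# Illustrative sketch only; all names and the modulus are assumptions,
# not taken from the paper. Polynomials are coefficient lists, lowest degree first.
P = 469762049  # an assumed prime modulus (7 * 2**26 + 1)


def poly_mul(a, b):
    """Plain (schoolbook) product of two polynomials modulo P."""
    out = [0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] = (out[i + j] + x * y) % P
    return out


def poly_rem(a, m):
    """Remainder of a divided by the monic polynomial m (degree >= 1), modulo P."""
    a = a[:]
    dm = len(m) - 1
    for i in range(len(a) - 1, dm - 1, -1):
        c = a[i]
        if c:
            for j in range(dm + 1):
                a[i - dm + j] = (a[i - dm + j] - c * m[j]) % P
    return a[:dm]


def build_tree(points):
    """Level 0 holds the leaves (x - u); each higher level multiplies adjacent pairs."""
    level = [[(-u) % P, 1] for u in points]
    tree = [level]
    while len(level) > 1:
        level = [
            poly_mul(level[i], level[i + 1]) if i + 1 < len(level) else level[i]
            for i in range(0, len(level), 2)
        ]
        tree.append(level)
    return tree


def multipoint_eval(f, points):
    """Evaluate f at every point by reducing it down the subproduct tree."""
    tree = build_tree(points)
    rems = [poly_rem(f, tree[-1][0])]
    for level in reversed(tree[:-1]):
        # Node i of this level has node i // 2 of the level above as its parent.
        rems = [poly_rem(rems[i // 2], m) for i, m in enumerate(level)]
    return [r[0] % P for r in rems]
```

For example, `multipoint_eval([3, 2, 1], [0, 1, 2, 3, 4])` evaluates 3 + 2x + x^2 at five points and returns `[3, 6, 11, 18, 27]`. An adaptive variant in the spirit of the abstract would replace `poly_mul` with FFT-based multiplication once the operands are large enough; the crossover point is not specified here.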


Similar resources

On The Parallelization Of Integer Polynomial Multiplication

With the advent of hardware accelerator technologies, multi-core processors and GPUs, much effort has gone into designing parallel algorithms that take advantage of those architectures. To achieve this goal, one needs to consider both algebraic complexity and parallelism, while making efficient use of memory traffic and cache and reducing overheads in the implementations. Polynomial multiplica...


Efficient parallelization of the genetic algorithm solution of traveling salesman problem on multi-core and many-core systems

Efficient parallelization of genetic algorithms (GAs) on state-of-the-art multi-threading or many-threading platforms is a challenge due to the difficulty of scheduling hardware resources with respect to thread concurrency. In this paper, to resolve the problem, a novel method is proposed, which parallelizes the GA by designing three concurrent kernels, each of which running some depe...


Parallelization of Minimum Spanning Tree Algorithms Using Distributed Memory Architectures

Finding a minimum spanning tree of a graph is a well-known problem in graph theory with many practical applications. We study serial variants of Prim’s and Kruskal’s algorithms and present their parallelization targeting message-passing parallel machines with distributed memory. We consider large graphs that cannot fit into the memory of one process. Experimental results show that Prim’s algorithm i...


Inference of large phylogenetic trees on parallel architectures

Due to high computational demands, the inference of large phylogenetic trees from molecular sequence data requires the use of HPC systems in order to obtain the necessary computational power and memory. The continuous explosive accumulation of molecular data, which is driven by the development of cost-effective sequencing techniques, further amplifies this requirement. Furthermore, a conti...


Automatic parallelization for embedded multi-core systems using high level cost models

Nowadays, embedded and cyber-physical systems are utilized in nearly all operational areas in order to support and enrich people’s everyday life. To cope with the demands imposed by modern embedded systems, the employment of Multiprocessor System-on-Chip (MPSoC) devices is often the most profitable solution. However, many embedded applications are still written in a sequential way. In order to ...




Publication date: 2014